[VPlan] Enable vectorization of early-exit loops with unit-stride fault-only-first loads #151300

arcbbb · 2025-07-30T09:54:24Z

Following #152422, this patch enables auto-vectorization of early-exit loops containing a single potentially faulting, unit-stride load by using the vp.load.ff intrinsic introduced in #128593.

Key changes:

Add VPWidenFFLoadRecipe that produces two results: the loaded vector and VL (the number of lanes actually loaded).
Use VL to ensure correctness:
- Step the induction variable by VL (variable-length stepping).
- Cap AVL to min(VF, remainder) to avoid over-reads at last iteration.
- Compute the early-exit condition from VL by replacing AnyOf with vp.reduce.or to avoid branch-on-poison.
Introduce two transforms:
- adjustFFLoadEarlyExitForPoisonSafety: rewrites the exit condition (AnyOf → vp.reduce.or(VL)) and sets AVL = min(VF, remainder).
- convertFFLoadEarlyExitToVLStepping: after region dissolution, converts early-exit loops to step by VL.

Limitations:

Supports a single potentially faulting load with unit stride.
Interleave count (IC) must be 1.

arcbbb · 2025-07-30T09:54:44Z

Regarding EVL propagation, I just noticed that header mask handling is fixed in #150202. This fix will need to be incorporated accordingly.

github-actions · 2025-07-30T09:57:31Z

✅ With the latest revision this PR passed the C/C++ code formatter.

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp

llvm/include/llvm/CodeGen/SelectionDAG.h

llvm/lib/Target/RISCV/RISCVISelLowering.cpp

llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp

alexey-bataev · 2025-07-30T13:01:29Z

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp

Check this first?

Updated. Thanks!

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp

llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp

david-arm · 2025-07-30T13:13:43Z

I think it would be good for reviewers if this patch was split up into several parts, probably roughly in this order:

Codegen specific parts related to the RISCV intrinsic,
A standalone loop vectorisation legality patch similar to [LV] Add initial legality checks for early exit loops with side effects #145663,
A loop vectoriser patch that starts vectorising the loops you're interested in.

arcbbb · 2025-08-01T13:05:52Z

I think it would be good for reviewers if this patch was split up into several parts, probably roughly in this order:

Codegen specific parts related to the RISCV intrinsic,

A standalone loop vectorisation legality patch similar to [LV] Add initial legality checks for ee loops with stores #145663,

A loop vectoriser patch that starts vectorising the loops you're interested in.

Thanks for outlining the steps. That’s very helpful! I’ll follow this order for splitting up the patch.

lukel97 · 2025-09-02T04:50:05Z

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp

@@ -7749,6 +7758,12 @@ VPRecipeBuilder::tryToWidenMemory(Instruction *I, ArrayRef<VPValue *> Operands,
    Builder.insert(VectorPtr);
    Ptr = VectorPtr;
  }
+  if (Legal->getSpeculativeLoads().contains(I)) {
+    auto *Load = dyn_cast<LoadInst>(I);
+    return new VPWidenFFLoadRecipe(*Load, Ptr, Mask, VPIRMetadata(*Load, LVer),


Just making a note for later. I think it would be good to avoid having a dead VPWidenFFLoadRecipe when there's no non-EVL version of vp.load.ff.

Can we instead just have one VPWidenFFLoadRecipe and here pass VF as the EVL argument? optimizeMaskToEVL can then set the EVL from the header mask later.

Updated! After rebasing, this patch supports WidenFFLoad in non–tail-folded mode only. To tail-fold early-exit loops and FFLoad support in the EVL transform will have to be addressed in a separate patch.

This patch splits out the legality checks from PR #151300, following the landing of PR #128593. It is a step toward supporting vectorization of early-exit loops that contain potentially faulting loads. In this commit, an early-exit loop is considered legal for vectorization if it satisfies the following criteria: 1. it is a read-only loop. 2. all potentially faulting loads are unit-stride, which is the only type currently supported by vp.load.ff.

llvmbot · 2025-09-21T01:43:02Z

@llvm/pr-subscribers-llvm-analysis
@llvm/pr-subscribers-vectorizers

@llvm/pr-subscribers-llvm-transforms

Author: Shih-Po Hung (arcbbb)

Changes

Following #152422, this patch enables auto-vectorization of early-exit loops containing a single potentially faulting, unit-stride load by using the vp.load.ff intrinsic introduced in #128593.

Key changes:

Add VPWidenFFLoadRecipe that produces two results: the loaded vector and VL (the number of lanes actually loaded).
Use VL to ensure correctness:
- Step the induction variable by VL (variable-length stepping).
- Cap AVL to min(VF, remainder) to avoid over-reads at last iteration.
- Compute the early-exit condition from VL by replacing AnyOf with vp.reduce.or to avoid branch-on-poison.
Introduce two transforms:
- adjustFFLoadEarlyExitForPoisonSafety: rewrites the exit condition (AnyOf → vp.reduce.or(VL)) and sets AVL = min(VF, remainder).
- convertFFLoadEarlyExitToVLStepping: after region dissolution, converts early-exit loops to step by VL.

Limitations:

Supports a single potentially faulting load with unit stride.
Interleave count (IC) must be 1.

Patch is 34.01 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/151300.diff

9 Files Affected:

(modified) llvm/lib/Transforms/Vectorize/LoopVectorize.cpp (+41-1)
(modified) llvm/lib/Transforms/Vectorize/VPlan.h (+45)
(modified) llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp (+3-2)
(modified) llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp (+43)
(modified) llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp (+96)
(modified) llvm/lib/Transforms/Vectorize/VPlanTransforms.h (+11)
(modified) llvm/lib/Transforms/Vectorize/VPlanValue.h (+3)
(modified) llvm/lib/Transforms/Vectorize/VPlanVerifier.cpp (+2-2)
(added) llvm/test/Transforms/LoopVectorize/RISCV/find.ll (+236)

diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 1d3cffa2b61bf..e28d4c45d4ab8 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -393,6 +393,12 @@ static cl::opt<bool> EnableEarlyExitVectorization(
     cl::desc(
         "Enable vectorization of early exit loops with uncountable exits."));
 
+static cl::opt<bool>
+    EnableEarlyExitWithFFLoads("enable-early-exit-with-ffload", cl::init(false),
+                               cl::Hidden,
+                               cl::desc("Enable vectorization of early-exit "
+                                        "loops with fault-only-first loads."));
+
 static cl::opt<bool> ConsiderRegPressure(
     "vectorizer-consider-reg-pressure", cl::init(false), cl::Hidden,
     cl::desc("Discard VFs if their register pressure is too high."));
@@ -3507,6 +3513,15 @@ LoopVectorizationCostModel::computeMaxVF(ElementCount UserVF, unsigned UserIC) {
     return FixedScalableVFPair::getNone();
   }
 
+  if (!Legal->getPotentiallyFaultingLoads().empty() && UserIC > 1) {
+    reportVectorizationFailure("Auto-vectorization of loops with potentially "
+                               "faulting loads is not supported when the "
+                               "interleave count is more than 1",
+                               "CantInterleaveLoopWithPotentiallyFaultingLoads",
+                               ORE, TheLoop);
+    return FixedScalableVFPair::getNone();
+  }
+
   ScalarEvolution *SE = PSE.getSE();
   ElementCount TC = getSmallConstantTripCount(SE, TheLoop);
   unsigned MaxTC = PSE.getSmallConstantMaxTripCount();
@@ -4076,6 +4091,7 @@ static bool willGenerateVectors(VPlan &Plan, ElementCount VF,
       case VPDef::VPReductionPHISC:
       case VPDef::VPInterleaveEVLSC:
       case VPDef::VPInterleaveSC:
+      case VPDef::VPWidenFFLoadSC:
       case VPDef::VPWidenLoadEVLSC:
       case VPDef::VPWidenLoadSC:
       case VPDef::VPWidenStoreEVLSC:
@@ -4550,6 +4566,10 @@ LoopVectorizationPlanner::selectInterleaveCount(VPlan &Plan, ElementCount VF,
   if (!Legal->isSafeForAnyVectorWidth())
     return 1;
 
+  // No interleaving for potentially faulting loads.
+  if (!Legal->getPotentiallyFaultingLoads().empty())
+    return 1;
+
   // We don't attempt to perform interleaving for loops with uncountable early
   // exits because the VPInstruction::AnyOf code cannot currently handle
   // multiple parts.
@@ -7216,6 +7236,9 @@ DenseMap<const SCEV *, Value *> LoopVectorizationPlanner::executePlan(
   // Regions are dissolved after optimizing for VF and UF, which completely
   // removes unneeded loop regions first.
   VPlanTransforms::dissolveLoopRegions(BestVPlan);
+
+  VPlanTransforms::convertFFLoadEarlyExitToVLStepping(BestVPlan);
+
   // Canonicalize EVL loops after regions are dissolved.
   VPlanTransforms::canonicalizeEVLLoops(BestVPlan);
   VPlanTransforms::materializeBackedgeTakenCount(BestVPlan, VectorPH);
@@ -7598,6 +7621,10 @@ VPRecipeBuilder::tryToWidenMemory(Instruction *I, ArrayRef<VPValue *> Operands,
     Builder.insert(VectorPtr);
     Ptr = VectorPtr;
   }
+  if (Legal->getPotentiallyFaultingLoads().contains(I))
+    return new VPWidenFFLoadRecipe(*cast<LoadInst>(I), Ptr, &Plan.getVF(), Mask,
+                                   VPIRMetadata(*I, LVer), I->getDebugLoc());
+
   if (LoadInst *Load = dyn_cast<LoadInst>(I))
     return new VPWidenLoadRecipe(*Load, Ptr, Mask, Consecutive, Reverse,
                                  VPIRMetadata(*Load, LVer), I->getDebugLoc());
@@ -8632,6 +8659,10 @@ VPlanPtr LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(
       if (Recipe->getNumDefinedValues() == 1) {
         SingleDef->replaceAllUsesWith(Recipe->getVPSingleValue());
         Old2New[SingleDef] = Recipe->getVPSingleValue();
+      } else if (isa<VPWidenFFLoadRecipe>(Recipe)) {
+        VPValue *Data = Recipe->getVPValue(0);
+        SingleDef->replaceAllUsesWith(Data);
+        Old2New[SingleDef] = Data;
       } else {
         assert(Recipe->getNumDefinedValues() == 0 &&
                "Unexpected multidef recipe");
@@ -8679,6 +8710,8 @@ VPlanPtr LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(
   // Adjust the recipes for any inloop reductions.
   adjustRecipesForReductions(Plan, RecipeBuilder, Range.Start);
 
+  VPlanTransforms::adjustFFLoadEarlyExitForPoisonSafety(*Plan);
+
   // Apply mandatory transformation to handle FP maxnum/minnum reduction with
   // NaNs if possible, bail out otherwise.
   if (!VPlanTransforms::runPass(VPlanTransforms::handleMaxMinNumReductions,
@@ -9869,7 +9902,14 @@ bool LoopVectorizePass::processLoop(Loop *L) {
     return false;
   }
 
-  if (!LVL.getPotentiallyFaultingLoads().empty()) {
+  if (EnableEarlyExitWithFFLoads) {
+    if (LVL.getPotentiallyFaultingLoads().size() > 1) {
+      reportVectorizationFailure("Auto-vectorization of loops with more than 1 "
+                                 "potentially faulting load is not enabled",
+                                 "MoreThanOnePotentiallyFaultingLoad", ORE, L);
+      return false;
+    }
+  } else if (!LVL.getPotentiallyFaultingLoads().empty()) {
     reportVectorizationFailure("Auto-vectorization of loops with potentially "
                                "faulting load is not supported",
                                "PotentiallyFaultingLoadsNotSupported", ORE, L);
diff --git a/llvm/lib/Transforms/Vectorize/VPlan.h b/llvm/lib/Transforms/Vectorize/VPlan.h
index f79855f7e2c5f..6e28c95ca601a 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.h
+++ b/llvm/lib/Transforms/Vectorize/VPlan.h
@@ -563,6 +563,7 @@ class VPSingleDefRecipe : public VPRecipeBase, public VPValue {
     case VPRecipeBase::VPInterleaveEVLSC:
     case VPRecipeBase::VPInterleaveSC:
     case VPRecipeBase::VPIRInstructionSC:
+    case VPRecipeBase::VPWidenFFLoadSC:
     case VPRecipeBase::VPWidenLoadEVLSC:
     case VPRecipeBase::VPWidenLoadSC:
     case VPRecipeBase::VPWidenStoreEVLSC:
@@ -2811,6 +2812,13 @@ class LLVM_ABI_FOR_TEST VPReductionEVLRecipe : public VPReductionRecipe {
             ArrayRef<VPValue *>({R.getChainOp(), R.getVecOp(), &EVL}), CondOp,
             R.isOrdered(), DL) {}
 
+  VPReductionEVLRecipe(RecurKind RdxKind, FastMathFlags FMFs, VPValue *ChainOp,
+                       VPValue *VecOp, VPValue &EVL, VPValue *CondOp,
+                       bool IsOrdered, DebugLoc DL = DebugLoc::getUnknown())
+      : VPReductionRecipe(VPDef::VPReductionEVLSC, RdxKind, FMFs, nullptr,
+                          ArrayRef<VPValue *>({ChainOp, VecOp, &EVL}), CondOp,
+                          IsOrdered, DL) {}
+
   ~VPReductionEVLRecipe() override = default;
 
   VPReductionEVLRecipe *clone() override {
@@ -3159,6 +3167,7 @@ class LLVM_ABI_FOR_TEST VPWidenMemoryRecipe : public VPRecipeBase,
   static inline bool classof(const VPRecipeBase *R) {
     return R->getVPDefID() == VPRecipeBase::VPWidenLoadSC ||
            R->getVPDefID() == VPRecipeBase::VPWidenStoreSC ||
+           R->getVPDefID() == VPRecipeBase::VPWidenFFLoadSC ||
            R->getVPDefID() == VPRecipeBase::VPWidenLoadEVLSC ||
            R->getVPDefID() == VPRecipeBase::VPWidenStoreEVLSC;
   }
@@ -3240,6 +3249,42 @@ struct LLVM_ABI_FOR_TEST VPWidenLoadRecipe final : public VPWidenMemoryRecipe,
   }
 };
 
+/// A recipe for widening loads using fault-only-first intrinsics.
+/// Produces two results: (1) the loaded data, and (2) the index of the first
+/// non-dereferenceable lane, or VF if all lanes are successfully read.
+struct VPWidenFFLoadRecipe final : public VPWidenMemoryRecipe, public VPValue {
+  VPWidenFFLoadRecipe(LoadInst &Load, VPValue *Addr, VPValue *VF, VPValue *Mask,
+                      const VPIRMetadata &Metadata, DebugLoc DL)
+      : VPWidenMemoryRecipe(VPDef::VPWidenFFLoadSC, Load, {Addr, VF},
+                            /*Consecutive*/ true, /*Reverse*/ false, Metadata,
+                            DL),
+        VPValue(this, &Load) {
+    new VPValue(nullptr, this); // Index of the first lane that faults.
+    setMask(Mask);
+  }
+
+  VP_CLASSOF_IMPL(VPDef::VPWidenFFLoadSC);
+
+  /// Return the VF operand.
+  VPValue *getVF() const { return getOperand(1); }
+  void setVF(VPValue *V) { setOperand(1, V); }
+
+  void execute(VPTransformState &State) override;
+
+#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
+  /// Print the recipe.
+  void print(raw_ostream &O, const Twine &Indent,
+             VPSlotTracker &SlotTracker) const override;
+#endif
+
+  /// Returns true if the recipe only uses the first lane of operand \p Op.
+  bool onlyFirstLaneUsed(const VPValue *Op) const override {
+    assert(is_contained(operands(), Op) &&
+           "Op must be an operand of the recipe");
+    return Op == getVF() || Op == getAddr();
+  }
+};
+
 /// A recipe for widening load operations with vector-predication intrinsics,
 /// using the address to load from, the explicit vector length and an optional
 /// mask.
diff --git a/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp b/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp
index 46ab7712e2671..684dbd25597e3 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp
@@ -188,8 +188,9 @@ Type *VPTypeAnalysis::inferScalarTypeForRecipe(const VPWidenCallRecipe *R) {
 }
 
 Type *VPTypeAnalysis::inferScalarTypeForRecipe(const VPWidenMemoryRecipe *R) {
-  assert((isa<VPWidenLoadRecipe, VPWidenLoadEVLRecipe>(R)) &&
-         "Store recipes should not define any values");
+  assert(
+      (isa<VPWidenLoadRecipe, VPWidenFFLoadRecipe, VPWidenLoadEVLRecipe>(R)) &&
+      "Store recipes should not define any values");
   return cast<LoadInst>(&R->getIngredient())->getType();
 }
 
diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index 8e9c3db50319f..3da8613a1d3cc 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -73,6 +73,7 @@ bool VPRecipeBase::mayWriteToMemory() const {
   case VPReductionPHISC:
   case VPScalarIVStepsSC:
   case VPPredInstPHISC:
+  case VPWidenFFLoadSC:
     return false;
   case VPBlendSC:
   case VPReductionEVLSC:
@@ -107,6 +108,7 @@ bool VPRecipeBase::mayReadFromMemory() const {
     return cast<VPInstruction>(this)->opcodeMayReadOrWriteFromMemory();
   case VPWidenLoadEVLSC:
   case VPWidenLoadSC:
+  case VPWidenFFLoadSC:
     return true;
   case VPReplicateSC:
     return cast<Instruction>(getVPSingleValue()->getUnderlyingValue())
@@ -3409,6 +3411,47 @@ void VPWidenLoadRecipe::print(raw_ostream &O, const Twine &Indent,
 }
 #endif
 
+void VPWidenFFLoadRecipe::execute(VPTransformState &State) {
+  Type *ScalarDataTy = getLoadStoreType(&Ingredient);
+  auto *DataTy = VectorType::get(ScalarDataTy, State.VF);
+  const Align Alignment = getLoadStoreAlignment(&Ingredient);
+
+  auto &Builder = State.Builder;
+  State.setDebugLocFrom(getDebugLoc());
+
+  Value *VL = State.get(getVF(), VPLane(0));
+  Type *I32Ty = Builder.getInt32Ty();
+  VL = Builder.CreateZExtOrTrunc(VL, I32Ty);
+  Value *Addr = State.get(getAddr(), true);
+  Value *Mask = nullptr;
+  if (VPValue *VPMask = getMask())
+    Mask = State.get(VPMask);
+  else
+    Mask = Builder.CreateVectorSplat(State.VF, Builder.getTrue());
+  CallInst *NewLI =
+      Builder.CreateIntrinsic(Intrinsic::vp_load_ff, {DataTy, Addr->getType()},
+                              {Addr, Mask, VL}, nullptr, "vp.op.load.ff");
+  NewLI->addParamAttr(
+      0, Attribute::getWithAlignment(NewLI->getContext(), Alignment));
+  applyMetadata(*NewLI);
+  Value *V = cast<Instruction>(Builder.CreateExtractValue(NewLI, 0));
+  Value *NewVL = Builder.CreateExtractValue(NewLI, 1);
+  State.set(getVPValue(0), V);
+  State.set(getVPValue(1), NewVL, /*NeedsScalar=*/true);
+}
+
+#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
+void VPWidenFFLoadRecipe::print(raw_ostream &O, const Twine &Indent,
+                                VPSlotTracker &SlotTracker) const {
+  O << Indent << "WIDEN ";
+  printAsOperand(O, SlotTracker);
+  O << ", ";
+  getVPValue(1)->printAsOperand(O, SlotTracker);
+  O << " = vp.load.ff ";
+  printOperands(O, SlotTracker);
+}
+#endif
+
 /// Use all-true mask for reverse rather than actual mask, as it avoids a
 /// dependence w/o affecting the result.
 static Instruction *createReverseEVL(IRBuilderBase &Builder, Value *Operand,
diff --git a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
index 1f6b85270607e..7e78cb6ed02ac 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
@@ -2760,6 +2760,102 @@ void VPlanTransforms::addExplicitVectorLength(
   Plan.setUF(1);
 }
 
+void VPlanTransforms::adjustFFLoadEarlyExitForPoisonSafety(VPlan &Plan) {
+  VPBasicBlock *Header = Plan.getVectorLoopRegion()->getEntryBasicBlock();
+  VPWidenFFLoadRecipe *LastFFLoad = nullptr;
+  for (VPBasicBlock *VPBB : VPBlockUtils::blocksOnly<VPBasicBlock>(
+           vp_depth_first_deep(Plan.getVectorLoopRegion())))
+    for (VPRecipeBase &R : *VPBB)
+      if (auto *Load = dyn_cast<VPWidenFFLoadRecipe>(&R)) {
+        assert(!LastFFLoad && "Only one FFLoad is supported");
+        LastFFLoad = Load;
+      }
+
+  // Skip if no FFLoad.
+  if (!LastFFLoad)
+    return;
+
+  // Ensure FFLoad does not read past the remainder in the last iteration.
+  // Set AVL to min(VF, remainder).
+  VPBuilder Builder(Header, Header->getFirstNonPhi());
+  VPValue *Remainder = Builder.createNaryOp(
+      Instruction::Sub, {&Plan.getVectorTripCount(), Plan.getCanonicalIV()});
+  VPValue *Cmp =
+      Builder.createICmp(CmpInst::ICMP_ULE, &Plan.getVF(), Remainder);
+  VPValue *AVL = Builder.createSelect(Cmp, &Plan.getVF(), Remainder);
+  LastFFLoad->setVF(AVL);
+
+  // To prevent branch-on-poison, rewrite the early-exit condition to
+  // VPReductionEVLRecipe. Expected pattern here is:
+  //   EMIT vp<%alt.exit.cond> = AnyOf
+  //   EMIT vp<%exit.cond> = or vp<%alt.exit.cond>, vp<%main.exit.cond>
+  //   EMIT branch-on-cond vp<%exit.cond>
+  auto *ExitingLatch = cast<VPBasicBlock>(Plan.getVectorLoopRegion()->getExiting());
+  auto *LatchExitingBr = cast<VPInstruction>(ExitingLatch->getTerminator());
+
+  VPValue *VPAnyOf = nullptr;
+  VPValue *VecOp = nullptr;
+  assert(
+      match(LatchExitingBr,
+            m_BranchOnCond(m_BinaryOr(m_VPValue(VPAnyOf), m_VPValue()))) &&
+      match(VPAnyOf, m_VPInstruction<VPInstruction::AnyOf>(m_VPValue(VecOp))) &&
+      "unexpected exiting sequence in early exit loop");
+
+  VPValue *OpVPEVLI32 = LastFFLoad->getVPValue(1);
+  VPValue *Mask = LastFFLoad->getMask();
+  FastMathFlags FMF;
+  auto *I1Ty = Type::getInt1Ty(Plan.getContext());
+  VPValue *VPZero = Plan.getOrAddLiveIn(ConstantInt::get(I1Ty, 0));
+  DebugLoc DL = VPAnyOf->getDefiningRecipe()->getDebugLoc();
+  auto *NewAnyOf =
+      new VPReductionEVLRecipe(RecurKind::Or, FMF, VPZero, VecOp, *OpVPEVLI32,
+                               Mask, /*IsOrdered*/ false, DL);
+  NewAnyOf->insertBefore(VPAnyOf->getDefiningRecipe());
+  VPAnyOf->replaceAllUsesWith(NewAnyOf);
+
+  // Using FirstActiveLane in the early-exit block is safe,
+  // exiting conditions guarantees at least one valid lane precedes
+  // any poisoned lanes.
+}
+
+void VPlanTransforms::convertFFLoadEarlyExitToVLStepping(VPlan &Plan) {
+  // Find loop header by locating VPWidenFFLoadRecipe.
+  VPWidenFFLoadRecipe *LastFFLoad = nullptr;
+
+  for (VPBasicBlock *VPBB : VPBlockUtils::blocksOnly<VPBasicBlock>(
+           vp_depth_first_shallow(Plan.getEntry())))
+    for (VPRecipeBase &R : *VPBB)
+      if (auto *Load = dyn_cast<VPWidenFFLoadRecipe>(&R)) {
+        assert(!LastFFLoad && "Only one FFLoad is supported");
+        LastFFLoad = Load;
+      }
+
+  // Skip if no FFLoad.
+  if (!LastFFLoad)
+    return;
+
+  VPBasicBlock *HeaderVPBB = LastFFLoad->getParent();
+  // Replace IVStep (VFxUF) with returned VL from FFLoad.
+  auto *CanonicalIV = cast<VPPhi>(&*HeaderVPBB->begin());
+  VPValue *Backedge = CanonicalIV->getIncomingValue(1);
+  assert(match(Backedge, m_c_Add(m_Specific(CanonicalIV),
+                                 m_Specific(&Plan.getVFxUF()))) &&
+         "Unexpected canonical iv");
+  VPRecipeBase *CanonicalIVIncrement = Backedge->getDefiningRecipe();
+  VPValue *OpVPEVLI32 = LastFFLoad->getVPValue(1);
+  VPBuilder Builder(HeaderVPBB, HeaderVPBB->getFirstNonPhi());
+  Builder.setInsertPoint(CanonicalIVIncrement);
+  auto *TC = Plan.getTripCount();
+  Type *CanIVTy = TC->isLiveIn()
+                      ? TC->getLiveInIRValue()->getType()
+                      : cast<VPExpandSCEVRecipe>(TC)->getSCEV()->getType();
+  auto *I32Ty = Type::getInt32Ty(Plan.getContext());
+  VPValue *OpVPEVL = Builder.createScalarZExtOrTrunc(
+      OpVPEVLI32, CanIVTy, I32Ty, CanonicalIVIncrement->getDebugLoc());
+
+  CanonicalIVIncrement->setOperand(1, OpVPEVL);
+}
+
 void VPlanTransforms::canonicalizeEVLLoops(VPlan &Plan) {
   // Find EVL loop entries by locating VPEVLBasedIVPHIRecipe.
   // There should be only one EVL PHI in the entire plan.
diff --git a/llvm/lib/Transforms/Vectorize/VPlanTransforms.h b/llvm/lib/Transforms/Vectorize/VPlanTransforms.h
index 69452a7e37572..bc5ce3bc43e76 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanTransforms.h
+++ b/llvm/lib/Transforms/Vectorize/VPlanTransforms.h
@@ -269,6 +269,17 @@ struct VPlanTransforms {
   ///      (branch-on-cond eq AVLNext, 0)
   static void canonicalizeEVLLoops(VPlan &Plan);
 
+  /// Applies to early-exit loops that use FFLoad. FFLoad may yield fewer active
+  /// lanes than VF. To prevent branch-on-poison and over-reads past the vector
+  /// trip count, use the returned VL for both stepping and exit computation.
+  /// Implemented by:
+  ///  - adjustFFLoadEarlyExitForPoisonSafety: replace AnyOf with vp.reduce.or over
+  ///    the first VL lanes; set AVL = min(VF, remainder).
+  ///  - convertFFLoadEarlyExitToVLStepping: after region dissolution, convert
+  ///    early-exit loops to variable-length stepping.
+  static void adjustFFLoadEarlyExitForPoisonSafety(VPlan &Plan);
+  static void convertFFLoadEarlyExitToVLStepping(VPlan &Plan);
+
   /// Lower abstract recipes to concrete ones, that can be codegen'd.
   static void convertToConcreteRecipes(VPlan &Plan);
 
diff --git a/llvm/lib/Transforms/Vectorize/VPlanValue.h b/llvm/lib/Transforms/Vectorize/VPlanValue.h
index 0678bc90ef4b5..b2bc430a09686 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanValue.h
+++ b/llvm/lib/Transforms/Vectorize/VPlanValue.h
@@ -40,6 +40,7 @@ class VPUser;
 class VPRecipeBase;
 class VPInterleaveBase;
 class VPPhiAccessors;
+class VPWidenFFLoadRecipe;
 
 // This is the base class of the VPlan Def/Use graph, used for modeling the data
 // flow into, within and out of the VPlan. VPValues can stand for live-ins
@@ -51,6 +52,7 @@ class LLVM_ABI_FOR_TEST VPValue {
   friend class VPInterleaveBase;
   friend class VPlan;
   friend class VPExpressionRecipe;
+  friend class VPWidenFFLoadRecipe;
 
   const unsigned char SubclassID; ///< Subclass identifier (for isa/dyn_cast).
 
@@ -351,6 +353,7 @@ class VPDef {
     VPWidenCastSC,
     VPWidenGEPSC,
     VPWidenIntrinsicSC,
+    VPWidenFFLoadSC,
     VPWidenLoadEVLSC,
     VPWidenLoadSC,
     VPWidenStoreEVLSC,
diff --git a/llvm/lib/Transforms/Vectorize/VPlanVerifier.cpp b/llvm/lib/Transforms/Vectorize/VPlanVerifier.cpp
index 92caa0b4e51d5..70e6e0d006eb6 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanVerifier.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanVerifier.cpp
@@ -166,8 +166,8 @@ bool VPlanVerifier::verifyEVLRecipe(const VPInstruction &EVL) const {
           }
           return VerifyEVLUse(*R, 2);
         })
-        .Case<VPWidenLoadEVLRecipe, VPVectorEndPointerRecipe,
-              VPInterleaveEVLRecipe>(
+        .Case<VPWidenLoadEVLRecipe, VPWidenFFLoadRecipe,
+              VPVectorEndPointerRecipe, VPInterleaveEVLRecipe>(
             [&](const VPRecipeBase *R) { return VerifyEVLUse(*R, 1); })
         .Case<VPInstructionWithType>(
             [&](const VPInstructionWithType *S) { return VerifyEVLUse(*S, 0); })
diff --git a/llvm/test/Transforms/LoopVectorize/RISCV/find.ll b/llvm/test/Transforms/LoopVectorize/RISCV/find.ll
new file mode 100644
index 0000000000000..f734bd5f53c82
--- /dev/null
+++ b/llvm/test/Transforms/LoopVectorize/RISCV/find.ll
@@ -0...
[truncated]

llvmbot · 2025-09-21T01:43:02Z

@llvm/pr-subscribers-backend-risc-v

Author: Shih-Po Hung (arcbbb)

Changes

Following #152422, this patch enables auto-vectorization of early-exit loops containing a single potentially faulting, unit-stride load by using the vp.load.ff intrinsic introduced in #128593.

Key changes:

Add VPWidenFFLoadRecipe that produces two results: the loaded vector and VL (the number of lanes actually loaded).
Use VL to ensure correctness:
- Step the induction variable by VL (variable-length stepping).
- Cap AVL to min(VF, remainder) to avoid over-reads at last iteration.
- Compute the early-exit condition from VL by replacing AnyOf with vp.reduce.or to avoid branch-on-poison.
Introduce two transforms:
- adjustFFLoadEarlyExitForPoisonSafety: rewrites the exit condition (AnyOf → vp.reduce.or(VL)) and sets AVL = min(VF, remainder).
- convertFFLoadEarlyExitToVLStepping: after region dissolution, converts early-exit loops to step by VL.

Limitations:

Supports a single potentially faulting load with unit stride.
Interleave count (IC) must be 1.

Patch is 34.01 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/151300.diff

9 Files Affected:

(modified) llvm/lib/Transforms/Vectorize/LoopVectorize.cpp (+41-1)
(modified) llvm/lib/Transforms/Vectorize/VPlan.h (+45)
(modified) llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp (+3-2)
(modified) llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp (+43)
(modified) llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp (+96)
(modified) llvm/lib/Transforms/Vectorize/VPlanTransforms.h (+11)
(modified) llvm/lib/Transforms/Vectorize/VPlanValue.h (+3)
(modified) llvm/lib/Transforms/Vectorize/VPlanVerifier.cpp (+2-2)
(added) llvm/test/Transforms/LoopVectorize/RISCV/find.ll (+236)

diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 1d3cffa2b61bf..e28d4c45d4ab8 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -393,6 +393,12 @@ static cl::opt<bool> EnableEarlyExitVectorization(
     cl::desc(
         "Enable vectorization of early exit loops with uncountable exits."));
 
+static cl::opt<bool>
+    EnableEarlyExitWithFFLoads("enable-early-exit-with-ffload", cl::init(false),
+                               cl::Hidden,
+                               cl::desc("Enable vectorization of early-exit "
+                                        "loops with fault-only-first loads."));
+
 static cl::opt<bool> ConsiderRegPressure(
     "vectorizer-consider-reg-pressure", cl::init(false), cl::Hidden,
     cl::desc("Discard VFs if their register pressure is too high."));
@@ -3507,6 +3513,15 @@ LoopVectorizationCostModel::computeMaxVF(ElementCount UserVF, unsigned UserIC) {
     return FixedScalableVFPair::getNone();
   }
 
+  if (!Legal->getPotentiallyFaultingLoads().empty() && UserIC > 1) {
+    reportVectorizationFailure("Auto-vectorization of loops with potentially "
+                               "faulting loads is not supported when the "
+                               "interleave count is more than 1",
+                               "CantInterleaveLoopWithPotentiallyFaultingLoads",
+                               ORE, TheLoop);
+    return FixedScalableVFPair::getNone();
+  }
+
   ScalarEvolution *SE = PSE.getSE();
   ElementCount TC = getSmallConstantTripCount(SE, TheLoop);
   unsigned MaxTC = PSE.getSmallConstantMaxTripCount();
@@ -4076,6 +4091,7 @@ static bool willGenerateVectors(VPlan &Plan, ElementCount VF,
       case VPDef::VPReductionPHISC:
       case VPDef::VPInterleaveEVLSC:
       case VPDef::VPInterleaveSC:
+      case VPDef::VPWidenFFLoadSC:
       case VPDef::VPWidenLoadEVLSC:
       case VPDef::VPWidenLoadSC:
       case VPDef::VPWidenStoreEVLSC:
@@ -4550,6 +4566,10 @@ LoopVectorizationPlanner::selectInterleaveCount(VPlan &Plan, ElementCount VF,
   if (!Legal->isSafeForAnyVectorWidth())
     return 1;
 
+  // No interleaving for potentially faulting loads.
+  if (!Legal->getPotentiallyFaultingLoads().empty())
+    return 1;
+
   // We don't attempt to perform interleaving for loops with uncountable early
   // exits because the VPInstruction::AnyOf code cannot currently handle
   // multiple parts.
@@ -7216,6 +7236,9 @@ DenseMap<const SCEV *, Value *> LoopVectorizationPlanner::executePlan(
   // Regions are dissolved after optimizing for VF and UF, which completely
   // removes unneeded loop regions first.
   VPlanTransforms::dissolveLoopRegions(BestVPlan);
+
+  VPlanTransforms::convertFFLoadEarlyExitToVLStepping(BestVPlan);
+
   // Canonicalize EVL loops after regions are dissolved.
   VPlanTransforms::canonicalizeEVLLoops(BestVPlan);
   VPlanTransforms::materializeBackedgeTakenCount(BestVPlan, VectorPH);
@@ -7598,6 +7621,10 @@ VPRecipeBuilder::tryToWidenMemory(Instruction *I, ArrayRef<VPValue *> Operands,
     Builder.insert(VectorPtr);
     Ptr = VectorPtr;
   }
+  if (Legal->getPotentiallyFaultingLoads().contains(I))
+    return new VPWidenFFLoadRecipe(*cast<LoadInst>(I), Ptr, &Plan.getVF(), Mask,
+                                   VPIRMetadata(*I, LVer), I->getDebugLoc());
+
   if (LoadInst *Load = dyn_cast<LoadInst>(I))
     return new VPWidenLoadRecipe(*Load, Ptr, Mask, Consecutive, Reverse,
                                  VPIRMetadata(*Load, LVer), I->getDebugLoc());
@@ -8632,6 +8659,10 @@ VPlanPtr LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(
       if (Recipe->getNumDefinedValues() == 1) {
         SingleDef->replaceAllUsesWith(Recipe->getVPSingleValue());
         Old2New[SingleDef] = Recipe->getVPSingleValue();
+      } else if (isa<VPWidenFFLoadRecipe>(Recipe)) {
+        VPValue *Data = Recipe->getVPValue(0);
+        SingleDef->replaceAllUsesWith(Data);
+        Old2New[SingleDef] = Data;
       } else {
         assert(Recipe->getNumDefinedValues() == 0 &&
                "Unexpected multidef recipe");
@@ -8679,6 +8710,8 @@ VPlanPtr LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(
   // Adjust the recipes for any inloop reductions.
   adjustRecipesForReductions(Plan, RecipeBuilder, Range.Start);
 
+  VPlanTransforms::adjustFFLoadEarlyExitForPoisonSafety(*Plan);
+
   // Apply mandatory transformation to handle FP maxnum/minnum reduction with
   // NaNs if possible, bail out otherwise.
   if (!VPlanTransforms::runPass(VPlanTransforms::handleMaxMinNumReductions,
@@ -9869,7 +9902,14 @@ bool LoopVectorizePass::processLoop(Loop *L) {
     return false;
   }
 
-  if (!LVL.getPotentiallyFaultingLoads().empty()) {
+  if (EnableEarlyExitWithFFLoads) {
+    if (LVL.getPotentiallyFaultingLoads().size() > 1) {
+      reportVectorizationFailure("Auto-vectorization of loops with more than 1 "
+                                 "potentially faulting load is not enabled",
+                                 "MoreThanOnePotentiallyFaultingLoad", ORE, L);
+      return false;
+    }
+  } else if (!LVL.getPotentiallyFaultingLoads().empty()) {
     reportVectorizationFailure("Auto-vectorization of loops with potentially "
                                "faulting load is not supported",
                                "PotentiallyFaultingLoadsNotSupported", ORE, L);
diff --git a/llvm/lib/Transforms/Vectorize/VPlan.h b/llvm/lib/Transforms/Vectorize/VPlan.h
index f79855f7e2c5f..6e28c95ca601a 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.h
+++ b/llvm/lib/Transforms/Vectorize/VPlan.h
@@ -563,6 +563,7 @@ class VPSingleDefRecipe : public VPRecipeBase, public VPValue {
     case VPRecipeBase::VPInterleaveEVLSC:
     case VPRecipeBase::VPInterleaveSC:
     case VPRecipeBase::VPIRInstructionSC:
+    case VPRecipeBase::VPWidenFFLoadSC:
     case VPRecipeBase::VPWidenLoadEVLSC:
     case VPRecipeBase::VPWidenLoadSC:
     case VPRecipeBase::VPWidenStoreEVLSC:
@@ -2811,6 +2812,13 @@ class LLVM_ABI_FOR_TEST VPReductionEVLRecipe : public VPReductionRecipe {
             ArrayRef<VPValue *>({R.getChainOp(), R.getVecOp(), &EVL}), CondOp,
             R.isOrdered(), DL) {}
 
+  VPReductionEVLRecipe(RecurKind RdxKind, FastMathFlags FMFs, VPValue *ChainOp,
+                       VPValue *VecOp, VPValue &EVL, VPValue *CondOp,
+                       bool IsOrdered, DebugLoc DL = DebugLoc::getUnknown())
+      : VPReductionRecipe(VPDef::VPReductionEVLSC, RdxKind, FMFs, nullptr,
+                          ArrayRef<VPValue *>({ChainOp, VecOp, &EVL}), CondOp,
+                          IsOrdered, DL) {}
+
   ~VPReductionEVLRecipe() override = default;
 
   VPReductionEVLRecipe *clone() override {
@@ -3159,6 +3167,7 @@ class LLVM_ABI_FOR_TEST VPWidenMemoryRecipe : public VPRecipeBase,
   static inline bool classof(const VPRecipeBase *R) {
     return R->getVPDefID() == VPRecipeBase::VPWidenLoadSC ||
            R->getVPDefID() == VPRecipeBase::VPWidenStoreSC ||
+           R->getVPDefID() == VPRecipeBase::VPWidenFFLoadSC ||
            R->getVPDefID() == VPRecipeBase::VPWidenLoadEVLSC ||
            R->getVPDefID() == VPRecipeBase::VPWidenStoreEVLSC;
   }
@@ -3240,6 +3249,42 @@ struct LLVM_ABI_FOR_TEST VPWidenLoadRecipe final : public VPWidenMemoryRecipe,
   }
 };
 
+/// A recipe for widening loads using fault-only-first intrinsics.
+/// Produces two results: (1) the loaded data, and (2) the index of the first
+/// non-dereferenceable lane, or VF if all lanes are successfully read.
+struct VPWidenFFLoadRecipe final : public VPWidenMemoryRecipe, public VPValue {
+  VPWidenFFLoadRecipe(LoadInst &Load, VPValue *Addr, VPValue *VF, VPValue *Mask,
+                      const VPIRMetadata &Metadata, DebugLoc DL)
+      : VPWidenMemoryRecipe(VPDef::VPWidenFFLoadSC, Load, {Addr, VF},
+                            /*Consecutive*/ true, /*Reverse*/ false, Metadata,
+                            DL),
+        VPValue(this, &Load) {
+    new VPValue(nullptr, this); // Index of the first lane that faults.
+    setMask(Mask);
+  }
+
+  VP_CLASSOF_IMPL(VPDef::VPWidenFFLoadSC);
+
+  /// Return the VF operand.
+  VPValue *getVF() const { return getOperand(1); }
+  void setVF(VPValue *V) { setOperand(1, V); }
+
+  void execute(VPTransformState &State) override;
+
+#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
+  /// Print the recipe.
+  void print(raw_ostream &O, const Twine &Indent,
+             VPSlotTracker &SlotTracker) const override;
+#endif
+
+  /// Returns true if the recipe only uses the first lane of operand \p Op.
+  bool onlyFirstLaneUsed(const VPValue *Op) const override {
+    assert(is_contained(operands(), Op) &&
+           "Op must be an operand of the recipe");
+    return Op == getVF() || Op == getAddr();
+  }
+};
+
 /// A recipe for widening load operations with vector-predication intrinsics,
 /// using the address to load from, the explicit vector length and an optional
 /// mask.
diff --git a/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp b/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp
index 46ab7712e2671..684dbd25597e3 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp
@@ -188,8 +188,9 @@ Type *VPTypeAnalysis::inferScalarTypeForRecipe(const VPWidenCallRecipe *R) {
 }
 
 Type *VPTypeAnalysis::inferScalarTypeForRecipe(const VPWidenMemoryRecipe *R) {
-  assert((isa<VPWidenLoadRecipe, VPWidenLoadEVLRecipe>(R)) &&
-         "Store recipes should not define any values");
+  assert(
+      (isa<VPWidenLoadRecipe, VPWidenFFLoadRecipe, VPWidenLoadEVLRecipe>(R)) &&
+      "Store recipes should not define any values");
   return cast<LoadInst>(&R->getIngredient())->getType();
 }
 
diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index 8e9c3db50319f..3da8613a1d3cc 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -73,6 +73,7 @@ bool VPRecipeBase::mayWriteToMemory() const {
   case VPReductionPHISC:
   case VPScalarIVStepsSC:
   case VPPredInstPHISC:
+  case VPWidenFFLoadSC:
     return false;
   case VPBlendSC:
   case VPReductionEVLSC:
@@ -107,6 +108,7 @@ bool VPRecipeBase::mayReadFromMemory() const {
     return cast<VPInstruction>(this)->opcodeMayReadOrWriteFromMemory();
   case VPWidenLoadEVLSC:
   case VPWidenLoadSC:
+  case VPWidenFFLoadSC:
     return true;
   case VPReplicateSC:
     return cast<Instruction>(getVPSingleValue()->getUnderlyingValue())
@@ -3409,6 +3411,47 @@ void VPWidenLoadRecipe::print(raw_ostream &O, const Twine &Indent,
 }
 #endif
 
+void VPWidenFFLoadRecipe::execute(VPTransformState &State) {
+  Type *ScalarDataTy = getLoadStoreType(&Ingredient);
+  auto *DataTy = VectorType::get(ScalarDataTy, State.VF);
+  const Align Alignment = getLoadStoreAlignment(&Ingredient);
+
+  auto &Builder = State.Builder;
+  State.setDebugLocFrom(getDebugLoc());
+
+  Value *VL = State.get(getVF(), VPLane(0));
+  Type *I32Ty = Builder.getInt32Ty();
+  VL = Builder.CreateZExtOrTrunc(VL, I32Ty);
+  Value *Addr = State.get(getAddr(), true);
+  Value *Mask = nullptr;
+  if (VPValue *VPMask = getMask())
+    Mask = State.get(VPMask);
+  else
+    Mask = Builder.CreateVectorSplat(State.VF, Builder.getTrue());
+  CallInst *NewLI =
+      Builder.CreateIntrinsic(Intrinsic::vp_load_ff, {DataTy, Addr->getType()},
+                              {Addr, Mask, VL}, nullptr, "vp.op.load.ff");
+  NewLI->addParamAttr(
+      0, Attribute::getWithAlignment(NewLI->getContext(), Alignment));
+  applyMetadata(*NewLI);
+  Value *V = cast<Instruction>(Builder.CreateExtractValue(NewLI, 0));
+  Value *NewVL = Builder.CreateExtractValue(NewLI, 1);
+  State.set(getVPValue(0), V);
+  State.set(getVPValue(1), NewVL, /*NeedsScalar=*/true);
+}
+
+#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
+void VPWidenFFLoadRecipe::print(raw_ostream &O, const Twine &Indent,
+                                VPSlotTracker &SlotTracker) const {
+  O << Indent << "WIDEN ";
+  printAsOperand(O, SlotTracker);
+  O << ", ";
+  getVPValue(1)->printAsOperand(O, SlotTracker);
+  O << " = vp.load.ff ";
+  printOperands(O, SlotTracker);
+}
+#endif
+
 /// Use all-true mask for reverse rather than actual mask, as it avoids a
 /// dependence w/o affecting the result.
 static Instruction *createReverseEVL(IRBuilderBase &Builder, Value *Operand,
diff --git a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
index 1f6b85270607e..7e78cb6ed02ac 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
@@ -2760,6 +2760,102 @@ void VPlanTransforms::addExplicitVectorLength(
   Plan.setUF(1);
 }
 
+void VPlanTransforms::adjustFFLoadEarlyExitForPoisonSafety(VPlan &Plan) {
+  VPBasicBlock *Header = Plan.getVectorLoopRegion()->getEntryBasicBlock();
+  VPWidenFFLoadRecipe *LastFFLoad = nullptr;
+  for (VPBasicBlock *VPBB : VPBlockUtils::blocksOnly<VPBasicBlock>(
+           vp_depth_first_deep(Plan.getVectorLoopRegion())))
+    for (VPRecipeBase &R : *VPBB)
+      if (auto *Load = dyn_cast<VPWidenFFLoadRecipe>(&R)) {
+        assert(!LastFFLoad && "Only one FFLoad is supported");
+        LastFFLoad = Load;
+      }
+
+  // Skip if no FFLoad.
+  if (!LastFFLoad)
+    return;
+
+  // Ensure FFLoad does not read past the remainder in the last iteration.
+  // Set AVL to min(VF, remainder).
+  VPBuilder Builder(Header, Header->getFirstNonPhi());
+  VPValue *Remainder = Builder.createNaryOp(
+      Instruction::Sub, {&Plan.getVectorTripCount(), Plan.getCanonicalIV()});
+  VPValue *Cmp =
+      Builder.createICmp(CmpInst::ICMP_ULE, &Plan.getVF(), Remainder);
+  VPValue *AVL = Builder.createSelect(Cmp, &Plan.getVF(), Remainder);
+  LastFFLoad->setVF(AVL);
+
+  // To prevent branch-on-poison, rewrite the early-exit condition to
+  // VPReductionEVLRecipe. Expected pattern here is:
+  //   EMIT vp<%alt.exit.cond> = AnyOf
+  //   EMIT vp<%exit.cond> = or vp<%alt.exit.cond>, vp<%main.exit.cond>
+  //   EMIT branch-on-cond vp<%exit.cond>
+  auto *ExitingLatch = cast<VPBasicBlock>(Plan.getVectorLoopRegion()->getExiting());
+  auto *LatchExitingBr = cast<VPInstruction>(ExitingLatch->getTerminator());
+
+  VPValue *VPAnyOf = nullptr;
+  VPValue *VecOp = nullptr;
+  assert(
+      match(LatchExitingBr,
+            m_BranchOnCond(m_BinaryOr(m_VPValue(VPAnyOf), m_VPValue()))) &&
+      match(VPAnyOf, m_VPInstruction<VPInstruction::AnyOf>(m_VPValue(VecOp))) &&
+      "unexpected exiting sequence in early exit loop");
+
+  VPValue *OpVPEVLI32 = LastFFLoad->getVPValue(1);
+  VPValue *Mask = LastFFLoad->getMask();
+  FastMathFlags FMF;
+  auto *I1Ty = Type::getInt1Ty(Plan.getContext());
+  VPValue *VPZero = Plan.getOrAddLiveIn(ConstantInt::get(I1Ty, 0));
+  DebugLoc DL = VPAnyOf->getDefiningRecipe()->getDebugLoc();
+  auto *NewAnyOf =
+      new VPReductionEVLRecipe(RecurKind::Or, FMF, VPZero, VecOp, *OpVPEVLI32,
+                               Mask, /*IsOrdered*/ false, DL);
+  NewAnyOf->insertBefore(VPAnyOf->getDefiningRecipe());
+  VPAnyOf->replaceAllUsesWith(NewAnyOf);
+
+  // Using FirstActiveLane in the early-exit block is safe,
+  // exiting conditions guarantees at least one valid lane precedes
+  // any poisoned lanes.
+}
+
+void VPlanTransforms::convertFFLoadEarlyExitToVLStepping(VPlan &Plan) {
+  // Find loop header by locating VPWidenFFLoadRecipe.
+  VPWidenFFLoadRecipe *LastFFLoad = nullptr;
+
+  for (VPBasicBlock *VPBB : VPBlockUtils::blocksOnly<VPBasicBlock>(
+           vp_depth_first_shallow(Plan.getEntry())))
+    for (VPRecipeBase &R : *VPBB)
+      if (auto *Load = dyn_cast<VPWidenFFLoadRecipe>(&R)) {
+        assert(!LastFFLoad && "Only one FFLoad is supported");
+        LastFFLoad = Load;
+      }
+
+  // Skip if no FFLoad.
+  if (!LastFFLoad)
+    return;
+
+  VPBasicBlock *HeaderVPBB = LastFFLoad->getParent();
+  // Replace IVStep (VFxUF) with returned VL from FFLoad.
+  auto *CanonicalIV = cast<VPPhi>(&*HeaderVPBB->begin());
+  VPValue *Backedge = CanonicalIV->getIncomingValue(1);
+  assert(match(Backedge, m_c_Add(m_Specific(CanonicalIV),
+                                 m_Specific(&Plan.getVFxUF()))) &&
+         "Unexpected canonical iv");
+  VPRecipeBase *CanonicalIVIncrement = Backedge->getDefiningRecipe();
+  VPValue *OpVPEVLI32 = LastFFLoad->getVPValue(1);
+  VPBuilder Builder(HeaderVPBB, HeaderVPBB->getFirstNonPhi());
+  Builder.setInsertPoint(CanonicalIVIncrement);
+  auto *TC = Plan.getTripCount();
+  Type *CanIVTy = TC->isLiveIn()
+                      ? TC->getLiveInIRValue()->getType()
+                      : cast<VPExpandSCEVRecipe>(TC)->getSCEV()->getType();
+  auto *I32Ty = Type::getInt32Ty(Plan.getContext());
+  VPValue *OpVPEVL = Builder.createScalarZExtOrTrunc(
+      OpVPEVLI32, CanIVTy, I32Ty, CanonicalIVIncrement->getDebugLoc());
+
+  CanonicalIVIncrement->setOperand(1, OpVPEVL);
+}
+
 void VPlanTransforms::canonicalizeEVLLoops(VPlan &Plan) {
   // Find EVL loop entries by locating VPEVLBasedIVPHIRecipe.
   // There should be only one EVL PHI in the entire plan.
diff --git a/llvm/lib/Transforms/Vectorize/VPlanTransforms.h b/llvm/lib/Transforms/Vectorize/VPlanTransforms.h
index 69452a7e37572..bc5ce3bc43e76 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanTransforms.h
+++ b/llvm/lib/Transforms/Vectorize/VPlanTransforms.h
@@ -269,6 +269,17 @@ struct VPlanTransforms {
   ///      (branch-on-cond eq AVLNext, 0)
   static void canonicalizeEVLLoops(VPlan &Plan);
 
+  /// Applies to early-exit loops that use FFLoad. FFLoad may yield fewer active
+  /// lanes than VF. To prevent branch-on-poison and over-reads past the vector
+  /// trip count, use the returned VL for both stepping and exit computation.
+  /// Implemented by:
+  ///  - adjustFFLoadEarlyExitForPoisonSafety: replace AnyOf with vp.reduce.or over
+  ///    the first VL lanes; set AVL = min(VF, remainder).
+  ///  - convertFFLoadEarlyExitToVLStepping: after region dissolution, convert
+  ///    early-exit loops to variable-length stepping.
+  static void adjustFFLoadEarlyExitForPoisonSafety(VPlan &Plan);
+  static void convertFFLoadEarlyExitToVLStepping(VPlan &Plan);
+
   /// Lower abstract recipes to concrete ones, that can be codegen'd.
   static void convertToConcreteRecipes(VPlan &Plan);
 
diff --git a/llvm/lib/Transforms/Vectorize/VPlanValue.h b/llvm/lib/Transforms/Vectorize/VPlanValue.h
index 0678bc90ef4b5..b2bc430a09686 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanValue.h
+++ b/llvm/lib/Transforms/Vectorize/VPlanValue.h
@@ -40,6 +40,7 @@ class VPUser;
 class VPRecipeBase;
 class VPInterleaveBase;
 class VPPhiAccessors;
+class VPWidenFFLoadRecipe;
 
 // This is the base class of the VPlan Def/Use graph, used for modeling the data
 // flow into, within and out of the VPlan. VPValues can stand for live-ins
@@ -51,6 +52,7 @@ class LLVM_ABI_FOR_TEST VPValue {
   friend class VPInterleaveBase;
   friend class VPlan;
   friend class VPExpressionRecipe;
+  friend class VPWidenFFLoadRecipe;
 
   const unsigned char SubclassID; ///< Subclass identifier (for isa/dyn_cast).
 
@@ -351,6 +353,7 @@ class VPDef {
     VPWidenCastSC,
     VPWidenGEPSC,
     VPWidenIntrinsicSC,
+    VPWidenFFLoadSC,
     VPWidenLoadEVLSC,
     VPWidenLoadSC,
     VPWidenStoreEVLSC,
diff --git a/llvm/lib/Transforms/Vectorize/VPlanVerifier.cpp b/llvm/lib/Transforms/Vectorize/VPlanVerifier.cpp
index 92caa0b4e51d5..70e6e0d006eb6 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanVerifier.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanVerifier.cpp
@@ -166,8 +166,8 @@ bool VPlanVerifier::verifyEVLRecipe(const VPInstruction &EVL) const {
           }
           return VerifyEVLUse(*R, 2);
         })
-        .Case<VPWidenLoadEVLRecipe, VPVectorEndPointerRecipe,
-              VPInterleaveEVLRecipe>(
+        .Case<VPWidenLoadEVLRecipe, VPWidenFFLoadRecipe,
+              VPVectorEndPointerRecipe, VPInterleaveEVLRecipe>(
             [&](const VPRecipeBase *R) { return VerifyEVLUse(*R, 1); })
         .Case<VPInstructionWithType>(
             [&](const VPInstructionWithType *S) { return VerifyEVLUse(*S, 0); })
diff --git a/llvm/test/Transforms/LoopVectorize/RISCV/find.ll b/llvm/test/Transforms/LoopVectorize/RISCV/find.ll
new file mode 100644
index 0000000000000..f734bd5f53c82
--- /dev/null
+++ b/llvm/test/Transforms/LoopVectorize/RISCV/find.ll
@@ -0...
[truncated]

arcbbb · 2025-09-21T02:22:06Z

Rebased after #152422 landed, and updated the PR description as well as title.

Split out from llvm#151300 to isolate TargetTransformInfo cost modelling for fault-only-first loads from VPlan implementation details. This change adds costing support for vp.load.ff independently of the VPlan work.

Split out from #151300 to isolate TargetTransformInfo cost modelling for fault-only-first loads from VPlan implementation details. This change adds costing support for vp.load.ff independently of the VPlan work. For now, model a vp.load.ff as cost-equivalent to a vp.load.

arcbbb · 2025-09-27T00:55:23Z

Gentle nudge. Any thoughts?

rzinsly · 2025-09-29T14:35:02Z

Hi, I wanted to try this patch but the new test is failing for a Release build:

******************** TEST 'LLVM :: Transforms/LoopVectorize/RISCV/find.ll' FAILED ********************
Exit Code: 2

Command Output (stdout):
--
# RUN: at line 2
/home/rzinsly/build/llvm/bin/opt -passes=loop-vectorize -enable-early-exit-with-ffload -mtriple=riscv64 -mattr=+v -S /home/rzinsly/src/llvm-project/llvm/test/Transforms/LoopVectorize/RISCV/find.ll | /home/rzinsly/build/llvm/bin/FileCheck /home/rzinsly/src/llvm-project/llvm/test/Transforms/LoopVectorize/RISCV/find.ll
# executed command: /home/rzinsly/build/llvm/bin/opt -passes=loop-vectorize -enable-early-exit-with-ffload -mtriple=riscv64 -mattr=+v -S /home/rzinsly/src/llvm-project/llvm/test/Transforms/LoopVectorize/RISCV/find.ll
# .---command stderr------------
# | PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace and instructions to reproduce the bug.
# | Stack dump:
# | 0.  Program arguments: /home/rzinsly/build/llvm/bin/opt -passes=loop-vectorize -enable-early-exit-with-ffload -mtriple=riscv64 -mattr=+v -S /home/rzinsly/src/llvm-project/llvm/test/Transforms/LoopVectorize/RISCV/find.ll
# | 1.  Running pass "function(loop-vectorize<no-interleave-forced-only;no-vectorize-forced-only;>)" on module "/home/rzinsly/src/llvm-project/llvm/test/Transforms/LoopVectorize/RISCV/find.ll"
# | 2.  Running pass "loop-vectorize<no-interleave-forced-only;no-vectorize-forced-only;>" on function "find_with_liveout"
# |  #0 0x00005a92b002fe62 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/rzinsly/build/llvm/bin/opt+0x2f67e62)
# |  #1 0x00005a92b002cda2 SignalHandler(int, siginfo_t*, void*) Signals.cpp:0:0
# |  #2 0x0000736eee245330 (/lib/x86_64-linux-gnu/libc.so.6+0x45330)
# |  #3 0x00005a92ae0d7df4 llvm::VPValue::getDefiningRecipe() (/home/rzinsly/build/llvm/bin/opt+0x100fdf4)
# |  #4 0x00005a92ae1436c8 llvm::VPlanTransforms::adjustFFLoadEarlyExitForPoisonSafety(llvm::VPlan&) (/home/rzinsly/build/llvm/bin/opt+0x107b6c8)
# |  #5 0x00005a92adf8422a llvm::LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(std::unique_ptr<llvm::VPlan, std::default_delete<llvm::VPlan>>, llvm::VFRange&, llvm::LoopVersioning*) (/home/rzinsly/build/llvm/bin/opt+0xebc22a)
# |  #6 0x00005a92adf85c79 llvm::LoopVectorizationPlanner::buildVPlansWithVPRecipes(llvm::ElementCount, llvm::ElementCount) (/home/rzinsly/build/llvm/bin/opt+0xebdc79)
# |  #7 0x00005a92adf861bd llvm::LoopVectorizationPlanner::plan(llvm::ElementCount, unsigned int) (/home/rzinsly/build/llvm/bin/opt+0xebe1bd)
# |  #8 0x00005a92adf8bae4 llvm::LoopVectorizePass::processLoop(llvm::Loop*) (/home/rzinsly/build/llvm/bin/opt+0xec3ae4)
...

I’m building with:
-DCMAKE_BUILD_TYPE=Release -DLLVM_TARGETS_TO_BUILD="RISCV" -DLLVM_ENABLE_PROJECTS="clang" -DLLVM_PARALLEL_LINK_JOBS=1 -DLLVM_OPTIMIZED_TABLEGEN=ON

It works with -DCMAKE_BUILD_TYPE=Debug.

llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp

arcbbb · 2025-10-03T03:01:36Z

Hi, I wanted to try this patch but the new test is failing for a Release build:

******************** TEST 'LLVM :: Transforms/LoopVectorize/RISCV/find.ll' FAILED ********************
Exit Code: 2

Command Output (stdout):
--
# RUN: at line 2
/home/rzinsly/build/llvm/bin/opt -passes=loop-vectorize -enable-early-exit-with-ffload -mtriple=riscv64 -mattr=+v -S /home/rzinsly/src/llvm-project/llvm/test/Transforms/LoopVectorize/RISCV/find.ll | /home/rzinsly/build/llvm/bin/FileCheck /home/rzinsly/src/llvm-project/llvm/test/Transforms/LoopVectorize/RISCV/find.ll
# executed command: /home/rzinsly/build/llvm/bin/opt -passes=loop-vectorize -enable-early-exit-with-ffload -mtriple=riscv64 -mattr=+v -S /home/rzinsly/src/llvm-project/llvm/test/Transforms/LoopVectorize/RISCV/find.ll
# .---command stderr------------
# | PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace and instructions to reproduce the bug.
# | Stack dump:
# | 0.  Program arguments: /home/rzinsly/build/llvm/bin/opt -passes=loop-vectorize -enable-early-exit-with-ffload -mtriple=riscv64 -mattr=+v -S /home/rzinsly/src/llvm-project/llvm/test/Transforms/LoopVectorize/RISCV/find.ll
# | 1.  Running pass "function(loop-vectorize<no-interleave-forced-only;no-vectorize-forced-only;>)" on module "/home/rzinsly/src/llvm-project/llvm/test/Transforms/LoopVectorize/RISCV/find.ll"
# | 2.  Running pass "loop-vectorize<no-interleave-forced-only;no-vectorize-forced-only;>" on function "find_with_liveout"
# |  #0 0x00005a92b002fe62 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/rzinsly/build/llvm/bin/opt+0x2f67e62)
# |  #1 0x00005a92b002cda2 SignalHandler(int, siginfo_t*, void*) Signals.cpp:0:0
# |  #2 0x0000736eee245330 (/lib/x86_64-linux-gnu/libc.so.6+0x45330)
# |  #3 0x00005a92ae0d7df4 llvm::VPValue::getDefiningRecipe() (/home/rzinsly/build/llvm/bin/opt+0x100fdf4)
# |  #4 0x00005a92ae1436c8 llvm::VPlanTransforms::adjustFFLoadEarlyExitForPoisonSafety(llvm::VPlan&) (/home/rzinsly/build/llvm/bin/opt+0x107b6c8)
# |  #5 0x00005a92adf8422a llvm::LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(std::unique_ptr<llvm::VPlan, std::default_delete<llvm::VPlan>>, llvm::VFRange&, llvm::LoopVersioning*) (/home/rzinsly/build/llvm/bin/opt+0xebc22a)
# |  #6 0x00005a92adf85c79 llvm::LoopVectorizationPlanner::buildVPlansWithVPRecipes(llvm::ElementCount, llvm::ElementCount) (/home/rzinsly/build/llvm/bin/opt+0xebdc79)
# |  #7 0x00005a92adf861bd llvm::LoopVectorizationPlanner::plan(llvm::ElementCount, unsigned int) (/home/rzinsly/build/llvm/bin/opt+0xebe1bd)
# |  #8 0x00005a92adf8bae4 llvm::LoopVectorizePass::processLoop(llvm::Loop*) (/home/rzinsly/build/llvm/bin/opt+0xec3ae4)
...

I’m building with: -DCMAKE_BUILD_TYPE=Release -DLLVM_TARGETS_TO_BUILD="RISCV" -DLLVM_ENABLE_PROJECTS="clang" -DLLVM_PARALLEL_LINK_JOBS=1 -DLLVM_OPTIMIZED_TABLEGEN=ON

It works with -DCMAKE_BUILD_TYPE=Debug.

Thanks for the report! I missed this when testing with a Release build +LLVM_ENABLE_ASSERTIONS=ON.
I’ve pushed a fix in the latest commit

Split out from llvm#151300 to isolate TargetTransformInfo cost modelling for fault-only-first loads from VPlan implementation details. This change adds costing support for vp.load.ff independently of the VPlan work. For now, model a vp.load.ff as cost-equivalent to a vp.load.

llvm/test/Transforms/LoopVectorize/RISCV/find.ll

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp

llvm/lib/Transforms/Vectorize/VPlan.h

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp

llvm/test/Transforms/LoopVectorize/RISCV/find.ll

arcbbb · 2025-10-21T15:45:21Z

Latest update addressed review comments:

Remove the dedicated recipe for fault-only-first loads; handle struct return types generically.
Replace ReductionEVL with masked AnyOf.
Add VPlan printing test.

1. Use (AnyOf (logical-and %aml, %cond)) instead of VPReductionEVLRecipe 2. Replace WidenFFLoadRecipe with WidenIntrinsicRecipe 3. Hanle StructType for WidenIntrinsicRecipe

arcbbb requested review from alexey-bataev, david-arm and fhahn July 30, 2025 09:54

Meinersbur mentioned this pull request Jul 30, 2025

[MLIR][OpenMP] Add canonical loop LLVM-IR lowering #147069

Merged

arcbbb requested review from Mel-Chen and lukel97 July 30, 2025 10:04

david-arm requested a review from huntergr-arm July 30, 2025 10:08

arcbbb mentioned this pull request Jul 30, 2025

[VP][RISCV] Add a vp.load.ff intrinsic for fault only first load. #128593

Merged

alexey-bataev reviewed Jul 30, 2025

View reviewed changes

arcbbb changed the title ~~[LV] Add support for speculative loads in loops that may fault~~ [RFC][LV] Add support for speculative loads in loops that may fault Jul 31, 2025

arcbbb mentioned this pull request Aug 7, 2025

[LV] Add initial legality checks for loops with unbound loads. #152422

Merged

arcbbb force-pushed the early-exit-dev branch from d044e7a to 1bb79f8 Compare August 20, 2025 07:54

lukel97 reviewed Sep 2, 2025

View reviewed changes

arcbbb force-pushed the early-exit-dev branch from 1bb79f8 to f8d7616 Compare September 21, 2025 01:14

arcbbb changed the title ~~[RFC][LV] Add support for speculative loads in loops that may fault~~ [VPlan] Enable vectorization of early-exit loops with unit-stride fault-only-first loads Sep 21, 2025

arcbbb force-pushed the early-exit-dev branch from f8d7616 to 1ab67ae Compare September 21, 2025 01:42

arcbbb marked this pull request as ready for review September 21, 2025 01:42

llvmbot added backend:RISC-V vectorizers llvm:transforms labels Sep 21, 2025

llvmbot added the llvm:analysis Includes value tracking, cost tables and constant folding label Sep 22, 2025

arcbbb force-pushed the early-exit-dev branch from b6add27 to 755ad37 Compare September 22, 2025 07:23

arcbbb mentioned this pull request Sep 24, 2025

[TTI][RISCV] Add cost modelling for intrinsic vp.load.ff #160470

Merged

arcbbb force-pushed the early-exit-dev branch 2 times, most recently from e4051e2 to 02f8262 Compare September 27, 2025 00:52

rzinsly reviewed Sep 29, 2025

View reviewed changes

llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp Outdated Show resolved Hide resolved

fhahn reviewed Oct 6, 2025

View reviewed changes

arcbbb added 4 commits October 26, 2025 23:29

Support WidenFFLoad in early-exit loop

bc2a814

Fix crash in release build

a3bf7f2

Address comments

99ea306

1. Use (AnyOf (logical-and %aml, %cond)) instead of VPReductionEVLRecipe 2. Replace WidenFFLoadRecipe with WidenIntrinsicRecipe 3. Hanle StructType for WidenIntrinsicRecipe

Fix error after merge

deb51fd

arcbbb force-pushed the early-exit-dev branch from 4e8d2e9 to deb51fd Compare October 27, 2025 06:46

Fix regression in test change

f2b9253

arcbbb mentioned this pull request Oct 27, 2025

[VPlan] Support struct return types for widen intrinsics (NFC). #165218

Open

Uh oh!

[VPlan] Enable vectorization of early-exit loops with unit-stride fault-only-first loads #151300

Are you sure you want to change the base?

[VPlan] Enable vectorization of early-exit loops with unit-stride fault-only-first loads #151300

Conversation

arcbbb commented Jul 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

arcbbb commented Jul 30, 2025

Uh oh!

github-actions bot commented Jul 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

alexey-bataev Jul 30, 2025

Choose a reason for hiding this comment

Uh oh!

arcbbb Aug 4, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

david-arm commented Jul 30, 2025

Uh oh!

arcbbb commented Aug 1, 2025

Uh oh!

lukel97 Sep 2, 2025

Choose a reason for hiding this comment

Uh oh!

arcbbb Sep 21, 2025

Choose a reason for hiding this comment

Uh oh!

llvmbot commented Sep 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Sep 21, 2025

Uh oh!

arcbbb commented Sep 21, 2025

Uh oh!

arcbbb commented Sep 27, 2025

Uh oh!

rzinsly commented Sep 29, 2025

Uh oh!

Uh oh!

arcbbb commented Oct 3, 2025

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

arcbbb commented Oct 21, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants

arcbbb commented Jul 30, 2025 •

edited

Loading

github-actions bot commented Jul 30, 2025 •

edited

Loading

llvmbot commented Sep 21, 2025 •

edited

Loading